Skip to content

Track bundle resource state sizes in telemetry (direct engine)#5199

Open
shreyas-goenka wants to merge 1 commit into
mainfrom
shreyas-goenka/state-size-telemetry
Open

Track bundle resource state sizes in telemetry (direct engine)#5199
shreyas-goenka wants to merge 1 commit into
mainfrom
shreyas-goenka/state-size-telemetry

Conversation

@shreyas-goenka
Copy link
Copy Markdown
Contributor

@shreyas-goenka shreyas-goenka commented May 6, 2026

Adds a resources_metadata field to the bundle deploy telemetry event with, per resource type, the count and the max/mean/median state size in bytes, plus the whole state file size.

Only direct deploys are measured, and collection does no marshalling, file read, or JSON parsing of its own. The direct engine already serializes each resource's state during the deploy and reconstructs it via WAL replay in Finalize; ExportStateFromData now records each entry's len(state) on the ResourceState it returns. deployCore stashes that finalized state on b.Metrics, and telemetry reads the per-resource sizes straight off the in-memory map. The whole-file size comes from a single os.Stat (no read/parse). Terraform stores state differently and is not collected (the field is absent there).

Because the metadata is direct-only it diverges across the DATABRICKS_BUNDLE_ENGINE test matrix, so the shared telemetry/deploy golden omits it; the logic is covered by unit tests.

The universe proto (resources_metadata, BundleResourcesMetadata, ResourceMetadata) is already merged, so this is ingested rather than dropped.

This pull request and its description were written by Isaac.

@shreyas-goenka shreyas-goenka force-pushed the shreyas-goenka/state-size-telemetry branch from 4a3106b to 715f019 Compare May 6, 2026 18:07
@shreyas-goenka shreyas-goenka force-pushed the shreyas-goenka/state-size-telemetry branch from 715f019 to 77ec7bc Compare May 6, 2026 18:32
@shreyas-goenka shreyas-goenka force-pushed the shreyas-goenka/state-size-telemetry branch from 77ec7bc to 85dd380 Compare May 6, 2026 18:49
@shreyas-goenka shreyas-goenka force-pushed the shreyas-goenka/state-size-telemetry branch from 85dd380 to 65bb595 Compare June 2, 2026 09:41
@shreyas-goenka shreyas-goenka force-pushed the shreyas-goenka/state-size-telemetry branch from 65bb595 to 44e4bc0 Compare June 2, 2026 10:10
@shreyas-goenka shreyas-goenka changed the title Track bundle resource counts and state file sizes in telemetry stats: Record per resource state file statistics in telemetry Jun 2, 2026
@shreyas-goenka shreyas-goenka marked this pull request as ready for review June 2, 2026 10:14
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 2, 2026

Approval status: pending

/acceptance/bundle/ - needs approval

Files: acceptance/bundle/telemetry/deploy/script
Suggested: @denik
Also eligible: @andrewnester, @pietern, @janniklasrose, @lennartkats-db, @anton-107

/bundle/ - needs approval

7 files changed
Suggested: @denik
Also eligible: @andrewnester, @pietern, @janniklasrose, @lennartkats-db, @anton-107

/libs/telemetry/ - needs approval

Files: libs/telemetry/protos/bundle_deploy.go
Suggested: @simonfaltum
Also eligible: @renaudhartert-db, @hectorcast-db, @parthban-db, @tanmay-db, @Divyansh-db, @tejaskochar-db, @mihaimitrea-db, @chrisst, @rauchy

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @simonfaltum, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

@eng-dev-ecosystem-bot
Copy link
Copy Markdown
Collaborator

eng-dev-ecosystem-bot commented Jun 2, 2026

Commit: 1cab854

Run: 27008239186

Comment thread bundle/phases/resources_metadata.go Outdated
db := dstate.NewDatabase("", 0)

pattern := dyn.NewPattern(dyn.Key("resources"), dyn.AnyKey(), dyn.AnyKey())
_, err := dyn.MapByPattern(b.Config.Value(), pattern, func(p dyn.Path, v dyn.Value) (dyn.Value, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why walk the config and not the actual state? They might not match 1-1.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To capture the state for both terraform and direct deployments. Most customers are still on terraform so this givess us approximate stats for the state sizes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, in that case we should not call StateFileSize, since it has nothing to do with it, we should call it ConfigFileSize.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does try to approximate the state size - by calling PrepareState:

	target := cfg
	if adapter, ok := adapters[resourceType]; ok {
		state, err := adapter.PrepareState(cfg)
		if err != nil {
			return nil, fmt.Errorf("prepare state: %w", err)
		}
		target = state
	}

	// dstate.SaveState writes resource state with MarshalIndent using these
	// exact prefix/indent arguments; matching them here means each resource's
	// byte length equals len(entry.State) on disk for direct deploys.
	raw, err := json.MarshalIndent(target, "  ", " ")
	if err != nil {
		return nil, fmt.Errorf("marshal: %w", err)
	}
	return raw, nil

Comment thread bundle/phases/resources_metadata.go Outdated

var fileSize int64
if len(db.State) > 0 {
raw, mErr := json.MarshalIndent(db, "", " ")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not os.Stat actual state file to get the size?

@shreyas-goenka shreyas-goenka requested a review from denik June 3, 2026 15:28
Comment thread bundle/phases/resources_metadata.go Outdated
// dstate.SaveState writes resource state with MarshalIndent using these
// exact prefix/indent arguments; matching them here means each resource's
// byte length equals len(entry.State) on disk for direct deploys.
raw, err := json.MarshalIndent(target, " ", " ")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will add some overhead to large bundles (e.g. python-generated one with a lot of jobs).

@shreyas-goenka shreyas-goenka force-pushed the shreyas-goenka/state-size-telemetry branch from 44e4bc0 to d16c208 Compare June 4, 2026 12:39
@shreyas-goenka shreyas-goenka changed the title stats: Record per resource state file statistics in telemetry Track bundle resource state sizes in telemetry (direct engine) Jun 4, 2026
Adds a `resources_metadata` field to the bundle deploy telemetry event with,
per resource type, the count and the max/mean/median state size in bytes, plus
the whole state file size.

Only direct deploys are measured, and collection does no marshalling, file read,
or JSON parsing of its own. The direct engine already serializes each resource's
state during the deploy and reconstructs it via WAL replay in Finalize;
ExportStateFromData now records each entry's len(state) on the ResourceState it
returns. deployCore stashes that finalized state on b.Metrics, and telemetry
reads the per-resource sizes straight off the in-memory map. The whole-file size
comes from a single os.Stat (no read/parse). Terraform stores state differently
and is not collected (the field is absent there).

Because the metadata is direct-only it diverges across the
DATABRICKS_BUNDLE_ENGINE test matrix, so the shared telemetry/deploy golden
omits it; the logic is covered by unit tests.

The universe proto (resources_metadata, BundleResourcesMetadata,
ResourceMetadata) is already merged, so this is ingested rather than dropped.

Co-authored-by: Isaac
@shreyas-goenka shreyas-goenka force-pushed the shreyas-goenka/state-size-telemetry branch from d16c208 to 1cab854 Compare June 5, 2026 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants